James M. Rehg

Toward Cognitive Supersensing in Multimodal Large Language Model

Feb 02, 2026, via arXiv

How Much 3D Do Video Foundation Models Encode?

Dec 23, 2025, via arXiv

Improving Personalized Search with Regularized Low-Rank Parameter Updates

Jun 11, 2025, via arXiv

LSM-2: Learning from Incomplete Wearable Sensor Data

Jun 05, 2025, via arXiv

Incorporating Flexible Image Conditioning into Text-to-Video Diffusion Models without Training

May 27, 2025, via arXiv

MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models

May 26, 2025, via arXiv

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

May 12, 2025, via arXiv

SocialGesture: Delving into Multi-person Gesture Understanding

Apr 03, 2025, via arXiv

Learning Predictive Visuomotor Coordination

Mar 30, 2025, via arXiv

Towards Online Multi-Modal Social Interaction Understanding

Mar 25, 2025, via arXiv